Efficient Rare Association Rule Mining Algorithm
Abstract
Data mining is the process of discovering correlations, patterns, trends, or relationships by searching through large amounts of data stored in repositories, corporate databases, and data warehouses. In the data mining field, a primary task is to mine frequent item sets from a transaction database using Association Rule Mining (ARM). Although most research in association rule mining has focused on extracting frequent patterns, the need for reliable rules that do not appear frequently is attracting growing interest in many areas. A rare association rule is an association rule formed between frequent and rare items or among rare items. In many cases, contradictions or exceptions also offer useful associations. Recent research focuses on discovering such associations, called rare associations. The mining of associations involving rare items is referred to as rare association rule mining. Conventional approaches to association rule mining use a single minimum support to identify frequent associations. Single minimum support approaches are not useful for mining interesting rare association rules. Hence, we propose an algorithm based on MSApriori, which we call MSApriori_VDB, that uses the vertical database format. Experimental results show that our algorithm outperforms previous approaches in both memory requirement and execution time by reducing the number of database scans.
Keywords: Data mining, Association rules, Rare items, Rare association rule mining, MSApriori_VDB
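The abstract does not give the details of MSApriori_VDB, so the following is only a minimal sketch of the two ideas it names, not the authors' algorithm. It assumes that each item has its own minimum item support (MIS), that an item set must reach the lowest MIS among its items, and that the vertical format maps each item to the set of transaction ids containing it. The function names `build_vertical` and `mine_with_mis`, the depth-first search order, and the sample data are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def build_vertical(transactions: List[Set[str]]) -> Dict[str, Set[int]]:
    """Scan the database once and map each item to its set of transaction ids."""
    tidsets: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def mine_with_mis(
    transactions: List[Set[str]], mis: Dict[str, float]
) -> Dict[FrozenSet[str], int]:
    """Return item sets whose support reaches the lowest MIS among their items.

    Assumption: `mis` holds a relative minimum support for every item.
    Support counts come from tidset intersections, so the database itself
    is read only once, inside build_vertical.
    """
    n = len(transactions)
    tidsets = build_vertical(transactions)
    # Sort items by ascending MIS: the first item of a sorted item set
    # then fixes the threshold for every extension of that item set.
    order = sorted(tidsets, key=lambda i: (mis[i], i))
    result: Dict[FrozenSet[str], int] = {}

    def extend(prefix: Tuple[str, ...], tids: Set[int], start: int, threshold: float) -> None:
        for k in range(start, len(order)):
            item = order[k]
            new_tids = tids & tidsets[item]  # support via intersection
            if len(new_tids) / n >= threshold:
                itemset = prefix + (item,)
                result[frozenset(itemset)] = len(new_tids)
                extend(itemset, new_tids, k + 1, threshold)

    for idx, item in enumerate(order):
        tids = tidsets[item]
        if len(tids) / n >= mis[item]:
            result[frozenset([item])] = len(tids)
            extend((item,), tids, idx + 1, mis[item])
    return result


if __name__ == "__main__":
    # Hypothetical data: "caviar" is rare, "bread" and "milk" are frequent.
    db = [
        {"bread", "milk"},
        {"bread", "milk"},
        {"bread", "milk", "caviar"},
        {"bread"},
        {"milk", "caviar"},
        {"bread", "milk"},
    ]
    thresholds = {"bread": 0.5, "milk": 0.5, "caviar": 0.3}
    for itemset, count in mine_with_mis(db, thresholds).items():
        print(sorted(itemset), count)
```

With a single minimum support of 0.5, every item set containing "caviar" would be discarded. With the lower per-item threshold, the item set of "caviar" and "milk" is kept, while the pair "bread" and "milk" is still held to the higher threshold.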
Similar resources
A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining
Data envelopment analysis (DEA) is a relatively new data-oriented approach to evaluating the performance of a set of peer entities called decision-making units (DMUs) that convert multiple inputs into multiple outputs. Within a relatively limited period, DEA has become a strong quantitative and analytical tool for measuring and evaluating performance. In an article written by Toloo et al. (2009...
MSApriori using Total Support Tree Data Structure
Association rule mining is one of the important problems of data mining. Single minimum support based approaches to association rule mining suffer from the "rare item problem". An improved approach, MSApriori, uses multiple supports to generate association rules that consider rare item sets. Necessity to first identify the "large" set of items contained in the input dataset to ge...
Numeric Multi-Objective Rule Mining Using Simulated Annealing Algorithm
Abstract as a single objective one. Measures like support, confidence, and other interestingness criteria that are used for evaluating a rule can be thought of as different objectives of the association rule mining problem. Support count is the number of records that satisfy all the conditions in the rule. This objective represents the accuracy of the rules extracted from the da...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses. It alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...
New Approaches to Analyze Gasoline Rationing
In this paper, the relation among factors in the road transportation sector from March 2005 to March 2011 is analyzed. Most previous studies take an economic point of view on gasoline consumption. Here, a new approach is proposed in which different data mining techniques are used to extract meaningful relations between the aforementioned factors. The main and dependent factor is gasolin...